hysop.backend.device.opencl.opencl_autotunable_kernel module
- class hysop.backend.device.opencl.opencl_autotunable_kernel.OpenClAutotunableKernel(cl_env, typegen, build_opts, autotuner_config, **kwds)
Bases: AutotunableKernel
- autotune(name, force_verbose=False, force_debug=False, **extra_kwds)
Autotune this kernel with given name and extra_kwds.
- check_cache(required_cache_size)
Check that required_cache_size bytes can fit in workgroup cache.
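As a rough illustration of this kind of check, the sketch below compares a requested byte count against a device limit. It is not hysop's implementation: the function name `fits_in_local_memory` and the `local_mem_size` parameter are hypothetical, and it assumes that "workgroup cache" refers to OpenCL local memory (with PyOpenCL, that limit is exposed as the `local_mem_size` attribute of a device). Whether the real method returns a flag or raises is not stated here, so the sketch simply returns a boolean.

```python
def fits_in_local_memory(required_cache_size, local_mem_size):
    """Return True if required_cache_size bytes fit in local_mem_size bytes.

    Both names are illustrative assumptions, not part of the hysop API.
    """
    if required_cache_size < 0:
        raise ValueError("required_cache_size must be non-negative")
    return required_cache_size <= local_mem_size


# A 16 x 16 tile of 4-byte floats needs 1024 bytes per work-group.
tile_bytes = 16 * 16 * 4
tile_fits = fits_in_local_memory(tile_bytes, 32 * 1024)
```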
- classmethod check_field(field, *args, **kwds)
Extend AutotunableKernel.check_field() by checking that the field is defined on backend OPENCL.
- classmethod check_fields(*fields, **kwds)
Extend AutotunableKernel.check_fields() by checking that all fields are defined on backend OPENCL.
- compute_global_work_size(work, local_work_size, extra_parameters, extra_kwds)
Compute aligned global_work_size from unaligned global_work_size and local_work_size. Input global_work_size may be None.
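Aligning a global work size usually means rounding each dimension up to the next multiple of the matching local work size, since OpenCL 1.x requires the global size to be evenly divisible by the local size when a local size is given. The sketch below shows that rounding only. The helper name `align_global_work_size` is hypothetical, it does not cover the case where the input is None, and it should not be read as hysop's actual implementation.

```python
def align_global_work_size(global_work_size, local_work_size):
    """Round each global size up to a multiple of the local size.

    Hypothetical helper for illustration; not part of the hysop API.
    """
    if len(global_work_size) != len(local_work_size):
        raise ValueError("work sizes must have the same number of dimensions")
    if any(l <= 0 for l in local_work_size):
        raise ValueError("local work sizes must be positive")
    # Ceiling division per dimension, then scale back by the local size.
    return tuple(((g + l - 1) // l) * l
                 for g, l in zip(global_work_size, local_work_size))


aligned = align_global_work_size((100, 7), (32, 4))  # (128, 8)
```

A kernel launched with the aligned size typically has to guard against the extra work items, for example by comparing the global id with the original, unaligned extent.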
- format_best_candidate(autotuner, file_basename, from_cache, name, extra_kwds, extra_parameters, work_size, work_load, global_work_size, local_work_size, args_mapping, args_list, program, kernel, kernel_name, kernel_src, kernel_statistics, src_hash, extra_kwds_hash, extra_kwds_hash_logs)
Post-processing callback for autotuner results. Transforms autotuner results into user-friendly kernel wrappers.
Returns an OpenClKernel with default_queue and default_args set to None. Only default_global_size, default_local_size, and args_mapping are set.
Use the build_launcher method to build an OpenClKernelLauncher from this OpenClKernel.
- abstract generate_kernel_src(global_work_size, local_work_size, extra_parameters, extra_kwds, tuning_mode, dry_run)
Generate kernel source code as a string.
Returns the known OpenCL arguments as a dictionary for codegen capabilities.
- generate_oclgrind_isolation_file(kernel, kernel_name, kernel_source, global_work_size, local_work_size, args_list, args_mapping, isolation_params, force=False)
- make_array_strides(dim, hardcode_arrays)
Build array strides in number of elements instead of bytes.
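To make the distinction between byte strides and element strides concrete, the sketch below converts one into the other and also derives element strides for a C-contiguous shape. Both helper names are hypothetical and are not taken from hysop; how `dim` and `hardcode_arrays` are used by the real method is not shown here.

```python
def c_contiguous_strides(shape):
    """Element strides of a C-contiguous array with the given shape."""
    strides = []
    step = 1
    # The last axis varies fastest, so walk the shape from the end.
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))


def byte_strides_to_element_strides(byte_strides, itemsize):
    """Convert strides expressed in bytes to strides in elements."""
    if itemsize <= 0:
        raise ValueError("itemsize must be positive")
    if any(s % itemsize for s in byte_strides):
        raise ValueError("byte strides must be multiples of itemsize")
    return tuple(s // itemsize for s in byte_strides)


# A (4, 5, 6) array of 4-byte floats has byte strides (120, 24, 4).
element_strides = byte_strides_to_element_strides((120, 24, 4), 4)
```

Element strides can be used directly as index multipliers on a typed pointer inside a kernel, whereas byte strides would first have to be divided by the element size.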
- max_device_work_group_size()
Return the maximum number of work items allowed by the device.